Abstract

A  geometric  framework  for  the  recognition  of  three-dimensional  objects  represented 
by  point  clouds  is  introduced  in  this  paper.  The  proposed  approach  is  based  on 
comparing distributions of intrinsic measurements on the point cloud. In
particular, intrinsic distances are exploited as signatures for representing the
point clouds. The first signature we introduce is the histogram of pairwise
diffusion distances between all points on the shape surface. These distances
represent the probability of traveling from one point to another in a fixed
number of random steps, that is, the average intrinsic distances of all possible
paths of a given number of steps between the two points. This signature is
augmented by the histogram of the actual pairwise geodesic distances in the
point cloud, the distribution of the ratio between these two distances, as well
as the distribution of the number of times each point lies on the shortest paths
between other points. These signatures are not only geometric but also invariant
to bends. We further augment these signatures by the distribution of a curvature
function and the distribution of a curvature weighted distance. These histograms
are compared using the $\chi^2$ or other common distance metrics for
distributions.  The  presentation  of  the  framework  is  accompanied  by  theoretical  and 
geometric  justification  and  state-of-the-art  experimental  results  with  the  standard 
Princeton  3D  shape  benchmark,  ISDB,  and  nonrigid  3D  datasets.  We  also  present 
a  detailed  analysis  of  the  particular  relevance  of  each  one  of  the  different  proposed 
histogram-based  signatures.  Finally,  we  briefly  discuss  a  more  local  approach  where 
the  histograms  are  computed  for  a  number  of  overlapping  patches  from  the  object 
rather  than  the  whole  shape,  thereby  opening  the  door  to  partial  shape  comparisons. 
1 Introduction and Key Contributions


Three-dimensional (3D) data is becoming more and more ubiquitous. 3D object
retrieval is essential for tasks such as navigation, target recognition, and
identification. In particular, point clouds are one of the most primitive and
fundamental representations of 3D objects, obtained, e.g., from laser range
scanners, and working directly with such a representation is critical and
challenging at the same time. See for example [9,15,19,22,27,30,31,33] and
references therein for some of the recent works in this area. In this paper, we develop
a  framework  for  3D  object  recognition  from  point  cloud  data.  In  particular, 
we  introduce  and  exploit  signatures  which  extract  the  intrinsic  geometry  of 
the  3D  shapes  represented  by  the  point  cloud. 

The diffusion distance, [18], and the geodesic distance are two intrinsic
(geometric) distances measured by paths constrained to travel on the point cloud
surface of the shapes, and are the key components of the framework proposed
here. The diffusion distance is related to the probability of traveling on the
surface from one point to another in a fixed number of random steps, while
the  geodesic  distance  is  the  length  of  the  shortest  surface-path  between  two 
points. 

Being invariant to bending of the surface makes these intrinsic distances
natural and useful for recognition of non-rigid objects, see e.g.,
[5,14,16,23,38] for the use of the geodesic distance. Obtaining an explicit
matching of the shapes requires comparing and matching the matrices of pairwise
distances [4,23]. However, it has been demonstrated, at least empirically, that
such computationally elaborate matchings can often be avoided
in  recognition  tasks.  In  particular,  in  [3],  the  authors  have  shown  that  with 
high  probability,  shapes  can  be  uniquely  distinguished  by  the  distribution  of 
Euclidean  (non-intrinsic)  distances  between  pairs  of  points  (samples  on  the 
shapes).  The  diffusion  distance  is  equivalent  to  the  Euclidean  distance  in  an 
embedding  space,  as  detailed  in  Section  2.1,  which  makes  this  argument  about 
distributions  applicable  to  diffusion  distances  in  the  embedding  space  as  well. 
This  argument  combined  with  the  need  for  a  bending  invariant  signature, 
provides  a  solid  reason  to  consider  the  distributions  of  intrinsic  distances  as 
signatures  for  object  retrieval.  Comparing  the  distributions  of  distances  (or 
any  other  features),  instead  of  applying  traditional  global  matching  methods, 
reduces  the  recognition  problem  to  a  one-dimensional  comparison  problem, 
considerably  saving  memory  and  computational  time  [14,16,20,24,28,37]. 

In real complex 3D scenarios, objects are often noisy and partially occluded or
not completely scanned. It is therefore important to perform such 3D recognition
robustly and from partial information (see also [13,26] for partial matching
results). Graph-based methods in the object recognition literature, e.g., see
[17] for Reeb graphs comparison and [29] for object recognition in videos, have
been shown useful for partial matching based on local shape patches. Exploiting
these graph-based matching techniques, combined with the intrinsic distance
distributions proposed here, the introduced framework starts building in this
direction of partial matching.

Motivated by these prior theoretical and computational results, in this paper we
introduce and exploit distribution/histogram-based signatures for 3D shape
recognition, and develop methods for global and local comparison between shapes
represented by point clouds. The first signature we introduce is the
distribution of the diffusion distance [7,18], which has not been used before
for comparing 3D surfaces. This distance basically measures the probability of
connectivity between points, considering all possible surface-constrained paths
between them and not just the shortest one. The diffusion distance, which is
easily computed from eigenvalue/eigenvector decompositions, is more robust than
the natural geodesic distance to topological noise in the point cloud data, as
well as topological errors created in the process of computing local
neighborhoods due to the lack of connectivity information. The combination of
both geodesic and diffusion distances also helps to better define these
neighborhoods, as demonstrated in this paper. We also use as signatures the
distribution of pairwise geodesic distances (a feature that has not been used
before for point cloud data), and the distribution of the ratio between
diffusion and geodesic distances. This ratio is a measure of the width of the
shape in the parts connecting the two points being considered in the
computation. We further introduce a measure of “centrality” for each point,
which is the number of shortest paths between pairs of points that include the
corresponding point, and use the distribution of this measure as an additional
signature in this work. All the above signatures are not only intrinsic to the
object, but invariant to bends as well. We also include the histogram of a
curvature function and the distribution of a curvature weighted distance in our
signatures in order to further improve the recognition performance. The relative
contribution of each one of these histogram-based geometric signatures, which
are all used here for the first time in a framework for point cloud shape
recognition (and some, like those associated with the diffusion distance, for
the first time for 3D recognition in general), is investigated in this work.

To compare these signatures for different shapes, both $\chi^2$ and
Jensen-Shannon divergence, [11], produce very good results. In particular, the
results reported here are based on the $\chi^2$. These results are
state-of-the-art for the standard datasets.

In  addition  to  these  global  comparisons,  and  in  order  to  develop  a  framework 
that  is  more  geared  toward  finding  local  shape  similarities,  we  also  propose 
a  method  based  on  the  computation  of  these  signatures  on  “patches”  of  the 
point  cloud  data  (see  also  [12,25,26]).  In  our  approach,  and  following  [25],  we 
use random overlapping patches on the shape, with a control on the amount
of  overlap.  In  contrast  to  the  more  classical  literature  on  patches,  we  explicitly 
consider  their  spatial  relationship  by  using  a  graph-based  approach. 

The remainder of this paper is organized as follows: in Section 2, we discuss the
basic concepts on the diffusion distance and the curvature classifier. Then, we
describe  the  distribution  signatures  we  develop  based  on  these  features,  and 
the  technique  to  compare  these  signatures  in  Section  3.  Experimental  results 
are  presented  in  Section  4,  and  in  Section  5,  we  discuss  a  graph-oriented  local 
framework  and  conclude  the  work. 

A  preliminary  version  of  this  paper  appeared  at  a  workshop,  [21].  Here  we 
extend  the  framework  by  adding  fundamental  new  signatures  that  improve 
the  results,  provide  additional  details,  and  present  additional  examples. 


2  Basic  Intrinsic  Measures 


2.1  Diffusion  Distance 


In  [7,18]  (see  also  [2]  for  related  work),  the  authors  introduced  diffusion  maps 
and diffusion distances as a method for data parametrization and dimensionality
reduction. The diffusion distance is equivalent to the Euclidean distance in the
embedding space corresponding to a mapping known as diffusion map. The diffusion
distance between two points in the point cloud involves the average of all the
paths of $m$ steps connecting these two points (average probability of traveling
between the points). This makes the diffusion distance a bending invariant
function of the path length and the shape width between two points. Since this
distance does not rely on just the shortest path between two points, it is more
robust than the geodesic distance. As briefly mentioned before, in [3] the
authors proved that the distribution of Euclidean distances is very informative
of the shape. Combining the theory in [3] and the characteristics of diffusion
distances, such as being the Euclidean distance in an
embedding  space  and  being  bending  invariant,  makes  the  diffusion  distance  a 
good  natural  signature  for  non-rigid  object  recognition. 

In order to compute the diffusion distance, we first create the affinity function
$k(x, y)$ over all pairs of points $x, y$ in the point cloud. These values become
the elements of an $N \times N$ square matrix $K$, where $N$ is the number of
available points. This matrix is symmetric, positive semidefinite, and positive.
If we then define $a(x, y)$ as

$$a(x, y) = \frac{k(x, y)}{v(x)}, \tag{1}$$

where $v(x) := \sum_y k(x, y)$ is the sum of the elements in each row, the matrix
$A$, composed by the elements $a(x, y)$, can be viewed as the probability for a
random walker on the point cloud to make a step from $x$ to $y$. Now if we
further define $\tilde{a}(x, y)$ as

$$\tilde{a}(x, y) = \sqrt{\frac{v(x)}{v(y)}}\, a(x, y), \tag{2}$$

the corresponding matrix $\tilde{A}$ is symmetric and can be decomposed as

$$\tilde{a}(x, y) = \sum_{i=0}^{N} \lambda_i^2 \phi_i(x) \phi_i(y), \tag{3}$$

where $\lambda_0^2 = 1 \geq \lambda_1^2 \geq \cdots$ are the eigenvalues (note the
“square,” which will simplify the expressions later) of the matrix $\tilde{A}$
and $\phi_i$ are the corresponding eigenvectors. Therefore, for the elements of
the matrix $\tilde{A}^m$ we obtain

$$\tilde{a}^{(m)}(x, y) = \sum_{i=0}^{N} \lambda_i^{2m} \phi_i(x) \phi_i(y), \tag{4}$$

which can be interpreted as representing the probability for a random walker
or Markov chain with transition matrix $A$ to reach $y$ from $x$ in $m$ steps.

Following in part standard concepts from kernel methods, the authors in [7]
introduced the diffusion map ($\Phi_m$) from the given point cloud data to a
Euclidean space using the kernel $\tilde{a}^{(m)}$. This mapping is obtained as

$$\Phi_m(x) = \left( \lambda_0^m \phi_0(x),\ \lambda_1^m \phi_1(x),\ \ldots,\ \lambda_N^m \phi_N(x) \right)^T. \tag{5}$$

It is easy to prove, e.g., see [34] for more details on these kernel methods, that
the squared Euclidean distance between the mapped points $\Phi_m(x)$ and
$\Phi_m(y)$ in the new space is

$$D_m^2(x, y) = \tilde{a}^{(m)}(x, x) + \tilde{a}^{(m)}(y, y) - 2\,\tilde{a}^{(m)}(x, y), \tag{6}$$

where $D_m(x, y)$ is exactly the diffusion distance between points $x$ and $y$.
(The selected values for $m$ and other parameters are presented in Section 4.)
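As an illustration of Eqs. (1)-(6), the following is a minimal NumPy sketch, not the authors' implementation. The function name `diffusion_distances` and its arguments are hypothetical; it assumes a symmetric affinity matrix with positive row sums and returns the squared distances of Eq. (6). The optional truncation `n_eig` mirrors the use of only a few leading eigenvalues mentioned in Section 4.

```python
import numpy as np

def diffusion_distances(K, m, n_eig=None):
    """Squared diffusion distances (Eq. 6) from a symmetric affinity matrix K.

    Hypothetical helper: `m` is the number of random-walk steps and `n_eig`
    optionally keeps only the largest eigenvalues (an approximation).
    """
    K = np.asarray(K, dtype=float)
    v = K.sum(axis=1)                        # row sums v(x) of Eq. (1)
    A_sym = K / np.sqrt(np.outer(v, v))      # symmetric kernel of Eq. (2)
    w, phi = np.linalg.eigh(A_sym)           # eigenvalues of A_sym (the lambda_i^2)
    order = np.argsort(w)[::-1]              # sort from largest to smallest
    w, phi = w[order], phi[:, order]
    if n_eig is not None:
        w, phi = w[:n_eig], phi[:, :n_eig]
    A_m = (phi * w**m) @ phi.T               # elements of the m-th power, Eq. (4)
    diag = np.diag(A_m)
    D2 = diag[:, None] + diag[None, :] - 2.0 * A_m   # Eq. (6)
    return np.maximum(D2, 0.0)               # clip tiny negative rounding errors
```

Taking `np.sqrt` of the result gives the distances themselves. With `n_eig=None` the result agrees with an explicit matrix power of the symmetric kernel, which is a convenient sanity check on small inputs.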

In order to separate the geometry of the point cloud from its density, $k(x, y)$
is further normalized, [18],

$$\tilde{k}(x, y) = \frac{k(x, y)}{p(x)\, p(y)}, \tag{7}$$

where $p(x) := \sum_y k(x, y)$, and $\tilde{k}$ is used in Eq. (1) instead of $k$.

In this work, we first use the Gaussian kernel
$k(x, y) = \exp(-\|x - y\|^2 / \sigma^2)$ to define the affinity matrix, where
$\sigma$ is the average of Euclidean distances between all pairs of points in the
shape. As a result of using Euclidean distances to define this affinity kernel,
we have topological shortcuts in computing the diffusion distance. This is
illustrated in Figure 1, where some points on the legs of the dog are so close to
each other in terms of Euclidean distance that the “shortcut” leads to
undesired (and incorrect) small diffusion distances between the two adjacent
back legs. One possible solution would be to reduce the value of $\sigma$.
However, with a small $\sigma$, many points become isolated and their diffusion
distance to all other points becomes too large. To avoid such shortcuts, we first
compute the geodesic distance between all the points in the shape, using Floyd’s
algorithm on the graph obtained from connecting only a few nearest neighbors,
3-6 neighbors in our case (an alternative technique is given in [22] which works
directly on the point cloud). Then, for each point $x$ we find the set
$\mathcal{N}(x)$ of nearest neighbors of $x$, in terms of geodesic distance.
Then, we define $k(x, y)$ by a neighborhood filtering as

$$k(x, y) = \begin{cases} \exp\left(-\dfrac{\|x - y\|^2}{\sigma^2}\right), & y \in \mathcal{N}(x), \\ 0, & y \notin \mathcal{N}(x). \end{cases} \tag{8}$$

See in Figure 1 how this addresses the shortcuts problem.
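The neighborhood filtering can be sketched as follows. This is only an illustration under stated assumptions: the function name `filtered_kernel` is hypothetical, the Gaussian is taken with $\sigma^2$ in the denominator, each point is treated as its own neighbor, and the mask is symmetrized so that the affinity matrix stays symmetric. The matrix of geodesic distances is assumed to be precomputed.

```python
import numpy as np

def filtered_kernel(points, geo, n_neighbors):
    """Gaussian affinity kept only between geodesic nearest neighbors.

    points      : (N, d) array of coordinates
    geo         : (N, N) array of precomputed geodesic distances
    n_neighbors : size of the geodesic neighborhood of each point
    """
    points = np.asarray(points, dtype=float)
    geo = np.asarray(geo, dtype=float)
    n = len(points)
    diff = points[:, None, :] - points[None, :, :]
    eu2 = (diff ** 2).sum(-1)                       # squared Euclidean distances
    # sigma: average Euclidean distance over all distinct pairs of points
    sigma = np.sqrt(eu2)[np.triu_indices(n, k=1)].mean()
    K = np.exp(-eu2 / sigma ** 2)
    # Geodesically nearest points per row; column 0 is the point itself.
    nearest = np.argsort(geo, axis=1)[:, :n_neighbors + 1]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.arange(n)[:, None], nearest] = True
    mask |= mask.T                                  # ASSUMPTION: symmetrize
    return np.where(mask, K, 0.0)
```

Pairs that are close in Euclidean terms but far apart along the surface receive zero affinity, which is what suppresses the shortcuts.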

This  concludes  the  presentation  of  the  diffusion  distance,  and  we  now  proceed 
to  present  the  basic  concepts  of  the  curvature  classifier. 
Fig. 1. In both pictures the colors show the diffusion distance for all the points
from  a  fixed  point  in  one  of  the  legs  of  the  dog  (dark  blue  for  small  and  dark  red 
for  large  values).  The  left  picture  shows  the  case  without  neighborhood  filtering  in 
computing  the  diffusion  distance,  obtaining  undesired  shortcuts  (see  how  the  back 
legs  are  considered  close).  In  the  right  figure  we  observe  how  these  shortcuts  are 
avoided  by  using  the  neighborhood  filtering  based  on  the  geodesic  distance.  (This  is 
a  color  figure.) 

2.2  Curvature  Classifier 


We now describe a local surface classifier introduced in [6], which will be used
to  augment  the  discriminatory  power  of  the  diffusion  and  geodesic  distances. 
This  classifier  robustly  distinguishes  between  smooth  regions  and  edges  or 
corners.  While  the  distributions  of  intrinsic  distances  and  their  ratio  ignore 
small  parts  on  the  shape  which  have  high  curvature,  using  the  distributions  of 
a  function  of  the  curvature  and  a  curvature  weighted  distance,  as  additional 
signatures,  helps  in  recognizing  these  parts. 

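If $M$ is the considered surface and $B_\epsilon(x)$ is a Euclidean ball with
radius $\epsilon$ centered at a point $x$, we define the zero moment of the
$\epsilon$-neighborhood of $x$ as

$$M_\epsilon^0(x) := \int_{B_\epsilon(x) \cap M} x \, dx, \tag{9}$$

and its first moment as

$$M_\epsilon^1(x) := \int_{B_\epsilon(x) \cap M} (x - M_\epsilon^0(x)) \otimes (x - M_\epsilon^0(x)) \, dx = \int_{B_\epsilon(x) \cap M} \left( x \otimes x - M_\epsilon^0(x) \otimes M_\epsilon^0(x) \right) dx, \tag{10}$$

where $y \otimes z := (y_i z_j)_{i,j=1,2,3}$. These moments are expected to be
robust to noise, and provide information about the curvature at $x$, using the
eigenvalues of the first moment and the zero moment shift defined as

$$T_\epsilon(x) := M_\epsilon^0(x) - x. \tag{11}$$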
For example, $T_\epsilon(x)$ scales quadratically with the filter width $\epsilon$
in smooth areas and linearly at corners and edges. The following function of
these moments is then used as a measure of curvature:


$$C_\epsilon(x) := G(\,\ldots\,), \tag{12}$$


where $\lambda_{\min}$ and $\lambda_{\max}$ are the minimum and maximum
eigenvalues of the first moment at point $x$, respectively. In particular, we
consider $G(s) = \frac{1}{\alpha + \beta s^2}$, with appropriately chosen
$\alpha$ and $\beta$. In our application, we have set $\alpha = 0.002$ and
$\beta = 2000$. The value of $C_\epsilon(x)$ will be close to $\frac{1}{\alpha}$
at smooth areas and $C_\epsilon(x) \ll \frac{1}{\alpha}$ at corners and edges.
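To make the role of $G$ and of the zero moment shift concrete, here is a small sketch. It is not the classifier of Eq. (12) itself, whose exact argument is not reproduced here. The names `G` and `zero_moment_shift` are hypothetical, and approximating the zero moment by the mean of the samples inside the ball is an assumption made for the discrete point cloud setting.

```python
import numpy as np

ALPHA, BETA = 0.002, 2000.0   # values of alpha and beta quoted in the text

def G(s, alpha=ALPHA, beta=BETA):
    """G(s) = 1 / (alpha + beta * s^2); its maximum, 1/alpha, is reached at s = 0."""
    return 1.0 / (alpha + beta * s ** 2)

def zero_moment_shift(points, x, eps):
    """Discrete stand-in for T_eps(x): centroid of the samples in B_eps(x), minus x.

    ASSUMPTION: the integral of Eq. (9) is replaced by a sample mean.
    """
    points = np.asarray(points, dtype=float)
    x = np.asarray(x, dtype=float)
    inside = np.linalg.norm(points - x, axis=1) <= eps
    return points[inside].mean(axis=0) - x
```

On a flat, symmetric patch the shift vanishes and `G` evaluates to about 500, the maximum value referred to in Section 4; a larger argument drives `G` toward zero.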

Having  the  basic  concepts  of  intrinsic  distances  and  curvature  functions,  we 
now  proceed  to  present  the  signatures  derived  from  them  and  the  proposed 
recognition  framework. 


3  Recognition  Framework 


In  this  section,  we  present  the  signatures  we  use  in  order  to  recognize  3D 
objects  represented  by  point  clouds,  and  the  techniques  for  comparing  between 
these  signatures  in  different  shapes. 


3.1  Characterizing  Signatures 


In this part, we present six characterizing signatures. Except for the histogram
of the geodesic distance, which has previously been used only for meshes, none
of them has been used before in 3D object recognition.

Histogram of diffusion distance. As our first signature, we use the histogram of
diffusion distance, motivated by the discussion in Section 2.1. Being bending
invariant, similar to geodesic distance, it has the advantage of being more
robust to noise since it exploits all the paths of a fixed number of steps, not
only the shortest one as in geodesic distance.

Histogram of geodesic distance. As mentioned above, the geodesic distance is the
length of the shortest path, constrained to the manifold, between two points.
Works such as those in [14,16] have used the histogram of the average geodesic
distance from a point to the rest as a signature for shape recognition
(primarily for meshes). This is motivated in part by the fact that geodesic
distances are the basic bending invariant features of the shape, and thereby
useful for non-rigid object recognition [23]. When compared with the diffusion
distance, the geodesic distance is more sensitive to noise, and it is thereby
used here to augment the other features, and not alone. We compute this distance
by Floyd’s algorithm, while we could also use the work in [22] to compute it
directly on the point cloud. To avoid shortcuts, we start with three nearest
neighbors in the neighborhood graph and increase it by one in each step, until
the constructed graph is connected or it reaches a maximum number.
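The procedure just described can be sketched as below. This is an illustrative implementation with hypothetical function names; `k_max` stands for the unspecified maximum number of neighbors, and the neighborhood graph is made undirected by keeping an edge whenever either endpoint selects the other, which is an assumption.

```python
import numpy as np

def knn_graph(eu, k):
    """Weighted adjacency of the k-nearest-neighbor graph (np.inf where no edge)."""
    n = len(eu)
    W = np.full((n, n), np.inf)
    np.fill_diagonal(W, 0.0)
    idx = np.argsort(eu, axis=1)[:, 1:k + 1]      # skip column 0 (the point itself)
    rows = np.repeat(np.arange(n), k)
    W[rows, idx.ravel()] = eu[rows, idx.ravel()]
    return np.minimum(W, W.T)                     # ASSUMPTION: undirected edges

def floyd_warshall(W):
    """All-pairs shortest path lengths, vectorized over rows and columns."""
    D = W.copy()
    for k in range(len(D)):
        D = np.minimum(D, D[:, k:k + 1] + D[k:k + 1, :])
    return D

def geodesic_distances(points, k_start=3, k_max=6):
    """Grow the neighborhood size until the graph is connected or k_max is hit."""
    points = np.asarray(points, dtype=float)
    eu = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    for k in range(k_start, k_max + 1):
        D = floyd_warshall(knn_graph(eu, k))
        if np.isfinite(D).all():                  # connected: no infinite distance
            break
    return D, k   # D may still contain np.inf if k_max was reached
```

The cubic cost of Floyd's algorithm is manageable for a few thousand points but would dominate for much larger clouds.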

Histogram of the ratio between diffusion and geodesic distances. The diffusion
distance contains information about the “width” of the object in the area
connecting two points by considering the number of paths with a fixed number of
steps between them, in addition to their distance on the manifold. Since the
geodesic distance is the length of the shortest path between two points, the
ratio between the diffusion distance and the geodesic distance provides
information about the average width in the path between the two points. The
histogram of this ratio is the third signature considered here. Since for small
geodesic distances the ratio is too large, we have excluded the distances that
are smaller than a threshold. The threshold we use in our experiments is three
times the average of the smallest nonzero geodesic distance at each point, and
we remove all pairs of points with a geodesic distance less than this
threshold.
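The thresholding rule can be written down as a short sketch. The function name `ratio_histogram` is hypothetical; the default bin range chosen by `np.histogram` and the normalization to unit sum are assumptions, and the sketch expects at least one pair to survive the threshold.

```python
import numpy as np

def ratio_histogram(diff, geo, bins=50):
    """Normalized histogram of diffusion/geodesic ratios over thresholded pairs."""
    diff = np.asarray(diff, dtype=float)
    geo = np.asarray(geo, dtype=float)
    n = len(geo)
    off_diag = ~np.eye(n, dtype=bool)
    # Smallest nonzero geodesic distance at each point, then averaged over points.
    nearest = np.where(off_diag & (geo > 0), geo, np.inf).min(axis=1)
    threshold = 3.0 * nearest.mean()
    iu = np.triu_indices(n, k=1)                 # each unordered pair once
    keep = geo[iu] >= threshold                  # drop pairs that are too close
    ratio = diff[iu][keep] / geo[iu][keep]
    hist, _ = np.histogram(ratio, bins=bins)
    return hist / hist.sum()
```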

Histogram of a centrality measure. One of the characteristics of a point in a 3D
surface is its intrinsic centrality. We propose a new function to measure the
centrality of each point, which is the number of shortest paths (geodesic
curves) between all the pairs of points in the shape that include the specific
point (to avoid noise and the possible effects of non-uniformity of the samples,
we can average this number in a K-neighborhood of each point). We expect higher
values of this measure for points closer to the center or in the center of
narrow parts (for example, legs of the animals), and lower values for the end
points. In the proposed point cloud recognition framework, the histogram of
this measure for all the points in a shape is used as an additional signature.
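One way to count how often each point lies on a shortest path is to run Floyd's algorithm with a next-hop table and then walk every path. The sketch below is a hypothetical illustration: it counts only interior points of each path, visits each unordered pair once, and follows a single shortest path when several have equal length. The neighborhood averaging mentioned above is omitted.

```python
import numpy as np

def centrality(W):
    """Number of shortest paths passing through each node of a weighted graph.

    W is an (N, N) adjacency matrix with np.inf where there is no edge
    and 0 on the diagonal.
    """
    W = np.asarray(W, dtype=float)
    n = len(W)
    D = W.copy()
    # nxt[i, j]: first node after i on the current best path from i to j.
    nxt = np.where(np.isfinite(W), np.arange(n)[None, :], -1)
    for k in range(n):
        via = D[:, k:k + 1] + D[k:k + 1, :]
        better = via < D
        D = np.where(better, via, D)
        nxt = np.where(better, nxt[:, k:k + 1], nxt)
    counts = np.zeros(n, dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            if nxt[i, j] < 0:            # j is not reachable from i
                continue
            node = nxt[i, j]
            while node != j:             # interior nodes only
                counts[node] += 1
                node = nxt[node, j]
    return counts
```

On a chain of five points with unit edges this gives `[0, 3, 4, 3, 0]`: the middle point is the most central and the end points are the least, in line with the expectation stated above.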

Histogram of the curvature classifier. In our experiments, we noticed that
considering only the histograms of bending invariant distances neglects the
information in the small high curvature parts. This becomes more critical for
recognizing classes of 3D objects as in the results presented in Section 4, and
not just single bent representatives per class as in [10,23]. For this purpose,
we propose two additional new signatures, the histogram of the curvature
classifier described in Section 2.2, and the histogram of a curvature weighted
distance (see below for the description of this signature). Since there are a
lot of low curvature points in each shape and many high curvature parts are
caused by noisy or non-smoothly sampled manifolds, the part of the curvature
histograms corresponding to these very low or high curvature points is not
informative. Thus, disregarding them improves the results.

Histogram of a curvature weighted distance. Following the above discussion about
considering curvature as a distinguishing feature, we define a new distance
between points which gives larger weights to the points with higher curvature.
This curvature weighted distance is computed by accumulating a linear
decreasing function of the curvature classifier, explained in Section 2.2, over
the shortest paths between all pairs of points (natural geodesic). We use the
histogram of these distances as the last of the proposed signatures.

In  Figure  2  we  illustrate  the  diffusion  distance,  geodesic  distance,  their  ratio, 
the curvature weighted distance, and the curvature classifier, as well as the
centrality  measure,  for  a  few  examples.  In  Figure  3,  we  present  each  one  of  the 
six  distribution/histogram-based  signatures  for  some  representative  shapes. 


Fig.  2.  From  left  to  right  in  each  row:  The  value  of  the  diffusion  distances,  geodesic 
distances,  their  ratio,  and  the  curvature  weighted  distance,  from  a  point  (dark  blue) 
to  the  rest  of  the  3D  shape;  followed  by  the  value  of  the  curvature  classifier  and 
the  centrality  measure  for  all  points.  Dark  blue  represents  small  values  and  dark  red 
large  values.  (This  is  a  color  figure.) 


3.2  Signatures  Comparison 


Fig. 3. All six histograms are shown following the respective shapes. The
histograms, from left to right, are presented in the order described in the
text. Colors on the shapes correspond to the geodesic distance from one point on
the shape to the rest. (This is a color figure.)

To conclude the description of the global shape recognition framework, we must
describe how we combine and compare the above mentioned histograms. In order to
compare two histograms, which are automatically normalized to compensate for the
shape scale, we tested different distance measures, such as $L_1$ and $L_2$
norms, correlation coefficients, and the Jensen-Shannon divergence (JSD), which
is the symmetric and smoothed version of the Kullback-Leibler divergence. The
best results were obtained for the $\chi^2$ measure, followed by the JSD.
Therefore, in our results, we have used the $\chi^2$ measure between two
normalized $Z$-bin distributions, $h_i$ and $h_j$, which is given by

$$\chi^2(h_i, h_j) = \frac{1}{2} \sum_{z=1}^{Z} \frac{\left( h_i(z) - h_j(z) \right)^2}{h_i(z) + h_j(z)}.$$

Having  the  basic  way  to  compare  pairs  of  histograms,  now  we  need  to  combine 
the  distance  metric  for  the  six  signatures  presented  in  the  previous  section 
in  order  to  obtain  the  “dissimilarity”  between  two  shapes.  For  the  results 
presented  in  Section  4,  we  multiply  the  six  distances  obtained  for  each  one 
of  the  six  different  histograms.  This  leads  to  better  results  than,  for  example, 
considering  multidimensional  histograms  of  two  or  more  features. 
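The comparison step can be summarized in a few lines. This is a minimal sketch with hypothetical function names; skipping bins that are empty in both histograms is an assumption made to avoid division by zero.

```python
def chi_square(h_i, h_j):
    """Chi-square distance between two normalized histograms of equal length."""
    total = 0.0
    for a, b in zip(h_i, h_j):
        if a + b > 0:                  # bins empty in both histograms add nothing
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total

def shape_dissimilarity(signatures_a, signatures_b):
    """Product of the per-signature chi-square distances between two shapes."""
    product = 1.0
    for h_a, h_b in zip(signatures_a, signatures_b):
        product *= chi_square(h_a, h_b)
    return product
```

Because the distances are multiplied, a single signature on which two shapes agree exactly drives the combined dissimilarity to zero, regardless of the other signatures.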


As detailed in the next section, this simple distance between histograms already
leads to state-of-the-art results. In the future we plan to further investigate
replacing the $\chi^2$ by other metrics, and also other ways of combining the
signatures, including automatically learning the weights and relevance of each
one of them.
4 Experimental Results


In  our  experimental  results,  for  comparison,  we  use  3D  shapes  from  the  same 
database  tested  in  [14],  which  is  the  combination  of  two  different  databases, 
part  of  the  Princeton  Shape  Benchmark  (PSB),  [35],  and  ISDB.  These  two 
databases  consist  of  22  categories,  overall.  We  also  present  the  results  tested 
on a nonrigid 3D database (NR) [4].

We have a total of 635 shapes from 27 categories in the three databases. Since
our proposed recognition techniques do not rely on the connectivity information
in these triangulated data, we first converted them to point clouds. We have
uniformly sampled 3000 random points from vertices of each shape, using the
maxmin sampling method in [8], after subdividing the triangles using the Graph
toolbox in MATLAB [32]. Even if the point samples of a shape are non-uniform,
as long as there are enough of them, 3000 points can still be uniformly sampled,
so the procedure applies without loss of generality to originally non-uniform
point clouds. We have used $m = 50$ for the number of steps of the path in the
diffusion distance, g = 100 nearest neighbors to find $\mathcal{N}(x)$ in Eq.
(8), and only the 6 largest eigenvalues of $\tilde{A}$. In computing the
curvature function, we used 8 nearest neighbors for each point and defined
$\epsilon$ as the maximum Euclidean distance to the 8-th neighbor of all points.
Since the maximum value of the curvature classifier is 500 (based on the
selected values of $\alpha$ and $\beta$), in computing the curvature weighted
distance, we use the curvature classifier subtracted from 500 at each point, as
the actual curvature function. All six histograms have 50 bins. For the
curvature classifier histogram, considering only the last 40 bins leads to
better recognition, as discussed in Section 3.1.

In order to evaluate the effectiveness of the different signatures and methods,
we first find the similarity measure between each pair of shapes by applying
each signature to form a square matrix of dissimilarity values. We use the
following three criteria for the recognition performance:

Nearest  neighbor:  The  percentage  of  the  cases  where  the  query  belongs  to 
the  same  category  as  its  closest  match  (not  considering  the  query  itself). 

First tier: The percentage of the shapes in the same category as the query
that are among its $D$ closest matches, where $D + 1$ is the total number of
shapes in the corresponding category.

Second tier: This value is the same as in the first tier with the difference
that now the $2D$ closest matches are considered.
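The three criteria can be computed from the dissimilarity matrix as in the following sketch. The function name `retrieval_scores` is hypothetical, the scores are averaged over all queries, and a query that is alone in its category contributes zero to both tiers, which is an assumption.

```python
def retrieval_scores(dissim, labels):
    """Average nearest-neighbor, first-tier and second-tier scores (fractions).

    dissim : square matrix (list of lists or array) of pairwise dissimilarities
    labels : category label of each shape
    """
    n = len(labels)
    nn = first = second = 0.0
    for q in range(n):
        # Rank all other shapes by increasing dissimilarity to the query.
        order = sorted((i for i in range(n) if i != q), key=lambda i: dissim[q][i])
        same = [labels[i] == labels[q] for i in order]
        d = sum(same)                      # D: other shapes in the query's category
        nn += same[0]
        if d:
            first += sum(same[:d]) / d
            second += sum(same[:2 * d]) / d
    return nn / n, first / n, second / n
```

Multiplying the returned fractions by 100 gives the percentages used in the tables.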

The  percentages  presented  here  are  the  average  values  of  these  measures  over 
one category or the whole dataset. Although the commonly used first and
second tiers are good criteria for evaluating recognition methods when the
intraclass variability is low, they can be misleading when there is a
lot  of  variability  in  the  classes.  For  example,  based  on  the  signatures  used  for 
recognition,  a  square  chair  without  handle  can  be  more  similar  than  a  round 
table  to  a  square  table.  In  this  case,  assume  we  have  the  same  number  of  square 
chairs,  square  tables,  and  round  tables  in  the  database;  and  the  tables  are  all 
in  the  same  category.  In  this  situation,  since  the  chairs  are  closer  matches  to 
square  tables  than  the  round  tables  are,  the  first  tier  of  square  tables  can  be 
close  to  50%.  On  the  other  hand,  if  the  tables  are  categorized  in  two  different 
categories, with exactly the same signatures, the first tier for square tables
can  be  increased  to  100%.  This  discussion  shows  that  the  amount  of  intraclass 
variability  in  different  databases  makes  a  big  difference  in  values  of  tiers.  The 
PSB  database,  that  is  reported  here,  has  a  large  intraclass  variability  in  many 
of  the  categories,  and  the  ISDB  and  NR  databases  have  lower  variability  within 
each  class.  Thus,  lower  values  for  tiers  is  expected  in  the  PSB  dataset. 
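
The 50% and 100% figures in the table and chair example can be checked with a short calculation. As an idealization, assume the database holds n square chairs, n square tables, and n round tables (n is an illustrative parameter), and that a square-table query ranks the other square tables first, then the chairs, then the round tables.

```latex
% All tables in one category of 2n shapes: D = 2n - 1.
% The top D matches are the n - 1 other square tables and the n chairs.
\mathrm{FT}_{\text{one category}} = \frac{n-1}{2n-1} \;\longrightarrow\; \frac{1}{2}
\quad (n \to \infty)

% Square tables in their own category of n shapes: D = n - 1.
% The top D matches are exactly the n - 1 other square tables.
\mathrm{FT}_{\text{two categories}} = \frac{n-1}{n-1} = 1
```

For instance, with n = 10 the first tier is 9/19, about 47%, in the first case and 100% in the second, although the signatures and the ranking are identical.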

Table 1 presents the results of using each signature, as well as some
combinations of them, over the three datasets. For comparison, we also include
the results obtained when using the histogram of the average geodesic distance
from each point to every other point, which was used in [14,16] for meshes.
Among the single-signature methods, the best result, considering the best
match, is obtained by the proposed diffusion distance, and the best overall
result is obtained by combining our proposed six signatures. In Figure 4, the
best matches given by combining these six signatures are presented for six
representative shapes.
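
For readers unfamiliar with the average geodesic distance signature (mG) used for comparison, the following minimal sketch shows how such a histogram could be built from a matrix of pairwise distances. The function name, the number of bins, and the normalization by the largest average are assumptions made for illustration; in the setting of this paper `dist` would hold geodesic distances, while the toy example uses points on a line.

```python
def average_distance_histogram(dist, bins=8):
    """Normalized histogram of each point's average distance to all other points."""
    n = len(dist)
    # Average distance from point i to every other point.
    means = [sum(row[j] for j in range(n) if j != i) / (n - 1)
             for i, row in enumerate(dist)]
    top = max(means)
    hist = [0] * bins
    for m in means:
        # Assumption: scale by the largest average so the binning ignores global scale.
        k = min(int(bins * m / top), bins - 1) if top > 0 else 0
        hist[k] += 1
    # Divide by the number of points so the bins sum to one.
    return [h / n for h in hist]


# Toy example: four points on a line at 0, 1, 2, 3.
line = [0, 1, 2, 3]
dist = [[abs(a - b) for b in line] for a in line]
hist = average_distance_histogram(dist, bins=4)
print(hist)
```

Note that this signature has one value per point, whereas the geodesic signature G is a histogram over all point pairs.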

Table 2 presents the results for some of the object categories, obtained with the proposed
global comparison (DCRGcDP) and with the state-of-the-art CDF method [14],
over the whole dataset used in [14]. The table covers
some of the 22 classes, including all the ones reported in [14].
In [14], the authors use as the signature a two-dimensional histogram that
combines the average geodesic distance (which, as shown in Table 1, does
not perform as well as the diffusion distance) and a measure of the diameter of the shape
around each point, computed over the triangulated data.

We  observe  that  the  overall  performance  of  our  method,  considering  the  best 
match,  over  the  whole  dataset  is  better  than  the  performance  of  the  CDF 
method,  which  reported  state-of-the-art  results  at  the  time  of  publication.  One 
can observe that in both techniques, the categories of “humans,” “horses,”
“human hands,” and “furniture” have the highest correct recognition rates. We
have  noticeably  better  results  in  categories  of  “airplanes,”  “humans,”  “ships,” 
“furniture,”  and  “fishes,”  showing  that  our  proposed  descriptors  better  capture 
the  intrinsic  characteristics  of  those  classes.  Having  a  very  diverse  collection 
of  models,  the  classes  “chairs,”  “tables,”  “insects,”  and  “helicopters”  show 
lower performance. Finally, note that unlike most algorithms reported in the
literature, including [14], we do not rely on the neighborhood information in
the triangulated data. This lack of information leads to lower recognition in
some categories, for example “cats,” when the cat is seated. Overall, recognizing
point clouds is significantly more challenging than working with meshes,
while we still obtain state-of-the-art results when compared to mesh-based
approaches.

Signature   Total (635)      ISDB (106)       PSB (381)        NR (148)
            BM   FT   ST     BM   FT   ST     BM   FT   ST     BM   FT   ST
D           68%  32%  47%    92%  59%  68%    56%  29%  44%    89%  63%  81%
G           57%  32%  48%    74%  45%  65%    52%  32%  47%    79%  45%  66%
mG          52%  29%  45%    82%  52%  73%    47%  29%  44%    73%  42%  65%
R           57%  26%  42%    75%  45%  60%    43%  24%  38%    86%  58%  74%
C           43%  28%  43%    69%  49%  68%    40%  23%  37%    81%  57%  76%
cD          50%  24%  38%    71%  45%  63%    35%  20%  32%    84%  55%  72%
P           28%  19%  33%    52%  36%  56%    31%  23%  39%    24%  20%  41%
DC          72%  35%  51%    91%  65%  74%    60%  30%  46%    91%  67%  83%
DR          66%  32%  46%    87%  56%  67%    51%  29%  43%    88%  63%  80%
DG          71%  39%  55%    89%  59%  71%    62%  35%  50%    91%  66%  82%
DRC         73%  35%  51%    92%  63%  73%    60%  32%  47%    90%  66%  82%
DGR         74%  38%  54%    88%  60%  71%    62%  36%  49%    94%  66%  82%
DCRG        78%  40%  56%    91%  65%  75%    68%  38%  51%    95%  69%  83%
DCRGcD      79%  38%  54%    94%  68%  77%    68%  36%  50%    95%  69%  82%
DCRmGcDP    78%  35%  50%    97%  71%  79%    68%  33%  49%    93%  67%  82%
DCRGcDP     80%  38%  53%    95%  70%  79%    71%  37%  51%    95%  69%  82%

Table 1

Effectiveness of each one of the six signatures (plus average geodesic) and some
of their combinations, evaluated with the global comparison method over the three
datasets: ISDB, PSB, NR, and the combination of all of them (Total). In the table,
D stands for diffusion distance, G for geodesic distance, mG for average geodesic
distance, R for ratio of diffusion and geodesic distances, P for the centrality
signature, C for curvature classifier, and cD for the curvature weighted distance. The
evaluation measures presented here are best match (BM), first tier (FT), and second
tier (ST).


In  Table  3,  values  of  best  match,  first  tier,  and  second  tier  are  presented  for  the 
databases  ISDB  and  PSB  and  their  combination  (Total),  for  four  methods:  DCRGcDP, 
CDF,  Light  Field  Descriptor  (LFD),  and  Spherical  Harmonics  (SH),  based  on  the 
results  reported  in  [14].  Light  Field  Descriptor  (LFD)  and  Spherical  Harmonics  (SH) 
are  two  out  of  the  three  top  performing  descriptors  for  PSB  as  described  in  [35]. 
Among  all  the  four  methods,  we  have  the  second  best  overall  results,  considering 
the best match. As discussed above, the tiers for the PSB database, which has a
larger amount of intraclass variability, are lower than the tiers for the
ISDB database. Recall that our results are for the more challenging point cloud 3D
data  representation,  while  the  other  algorithms  are  reported  on  meshes. 

Category         Method     Best Match   First Tier   Second Tier
Planes (29)      DCRGcDP    76%          26%          37%
                 CDF        45%          25%          44%
Humans (134)     DCRGcDP    99%          60%          89%
                 CDF        88%          57%          84%
Horses (16)      DCRGcDP    88%          64%          73%
                 CDF        94%          68%          85%
Hands (33)       DCRGcDP    85%          48%          58%
                 CDF        82%          67%          76%
Insects (20)     DCRGcDP    60%          17%          24%
                 CDF        71%          23%          34%
Chairs (33)      DCRGcDP    58%          18%          30%
                 CDF        45%          20%          34%
Ships (21)       DCRGcDP    67%          19%          25%
                 CDF        24%          11%          20%
Guns (7)         DCRGcDP    71%          29%          35%
                 CDF        71%          40%          50%
Furniture (19)   DCRGcDP    89%          37%          56%
                 CDF        63%          NA           NA
Fishes (26)      DCRGcDP    81%          37%          48%
                 CDF        65%          NA           NA
Birds (20)       DCRGcDP    40%          18%          23%
                 CDF        30%          NA           NA
Total (487)      DCRGcDP    76%          38%          53%
                 CDF        71%          45%          63%

Table 2

Recognition results for both DCRGcDP and CDF matching methods for some of
the 3D object categories, where the recognition is among all the shapes in the PSB and
ISDB datasets used in [14].


Fig.  4.  Results  of  shape  retrieval  for  the  global  recognition  algorithm  using  all  six 
histogram-based  signatures.  The  first  column  on  the  left  shows  the  query  models,  and 
the  other  figures  on  each  row  show  the  top  eight  matches.  (This  is  a  color  figure.) 

Method      Total (487)      ISDB (106)       PSB (381)
            BM   FT   ST     BM   FT   ST     BM   FT   ST
DCRGcDP     76%  38%  53%    95%  70%  79%    71%  37%  51%
CDF         71%  45%  63%    100% 98%  100%   65%  40%  58%
LFD         79%  42%  59%    73%  44%  62%    87%  47%  62%
SH          75%  37%  54%    78%  47%  64%    77%  41%  57%

Table 3

The overall results of the four methods, DCRGcDP, CDF, LFD, and SH, tested on the
ISDB and PSB databases and their combination (Total).

5 Discussions, Local Analysis, and Conclusions


In this paper, we introduced a new framework for 3D object recognition from point
cloud data. The proposed 3D signatures are derived from the distribution of the
pairwise diffusion distances, the distribution of the pairwise geodesic distances, the
distribution of the ratio between these two distances, the distribution of a centrality
measure, the distribution of a curvature classifier, and the distribution of a curvature
weighted distance. The use of intrinsic distances and their distributions is supported
by theoretical work as well as by extensive experimental results in both the 3D shape
recognition and image analysis literature. Although the distribution of geodesic
distances has been used before for 3D recognition of triangulated surfaces (not
point clouds, as reported here), the other signatures have not been incorporated in
prior art.

Since the information in the signatures (histograms) defined on the whole shape
is global, it might ignore some local information that is important for identification. It is
therefore reasonable to compute the signatures more locally. In addition, in practical
scenarios where occlusions (or partial acquisition) are present, there is a need for
more local signatures. We extend the global framework to (semi-)local recognition
by considering overlapping patches (similar to the idea in [25]). Initially, the patches
are 50 sets, each containing the 300 points closest, in the geodesic sense, to one of 50
center points sampled from the shape by the maxmin sampling method [8]. Then, all the patches
with more than 70% overlap are joined into one patch. These patches become nodes
in a graph, with attributes given by the six histograms described in Section 3.1,
and edges encoding the spatial relationship between the patches (connecting the
nodes corresponding to two neighboring patches). The edge weights are the geodesic
distances between the two corresponding center points, computed on the whole
shape. Then, we apply a graph comparison algorithm, following in part the work
introduced in [29] for shape recognition in video. We have applied this method
to a dataset of 119 shapes from the Princeton Shape Benchmark (PSB) [35]
and the SCAPE pose and body shape data [1], and the preliminary overall
results were comparable to those of the global point cloud 3D shape recognition method
introduced in this paper. In categories such as tables, human hands, and insects, the
graph method produced better results, while for cars, planes, and horses, the global
method led to better results. One of our ongoing objectives is to further improve
the graph comparison method and to use it in partial matching applications.
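
The patch construction described above can be sketched as follows. This is a minimal illustration, not the implementation used for the reported experiments: it assumes a precomputed matrix `dist` of pairwise distances (geodesic in the setting of this paper; the toy example uses points on a line), measures overlap as the shared fraction of the smaller patch, and merges patches greedily. The function names, the overlap measure, and the merge order are all assumptions. In the paper's setting one would use 50 centers and 300 points per patch.

```python
def maxmin_sample(dist, k, start=0):
    """Maxmin (farthest-point) sampling: each new center is the point
    farthest from the centers chosen so far."""
    centers = [start]
    nearest = list(dist[start])  # distance from each point to its closest center
    while len(centers) < k:
        nxt = max(range(len(dist)), key=lambda i: nearest[i])
        centers.append(nxt)
        nearest = [min(a, b) for a, b in zip(nearest, dist[nxt])]
    return centers


def build_patches(dist, centers, size):
    """Each patch holds the `size` points closest to its center (center included)."""
    n = len(dist)
    return [set(sorted(range(n), key=lambda i: dist[c][i])[:size]) for c in centers]


def merge_patches(patches, threshold=0.7):
    """Greedily join patches whose overlap exceeds `threshold`.
    Overlap = |A & B| / min(|A|, |B|), which is an assumed definition."""
    merged = []
    for p in patches:
        p = set(p)
        changed = True
        while changed:
            changed = False
            for q in merged:
                if len(p & q) / min(len(p), len(q)) > threshold:
                    merged.remove(q)  # absorb q into p, then rescan
                    p |= q
                    changed = True
                    break
        merged.append(p)
    return merged


# Toy example: twelve points on a line, three centers, four points per patch.
line = list(range(12))
dist = [[abs(a - b) for b in line] for a in line]
centers = maxmin_sample(dist, 3)
patches = merge_patches(build_patches(dist, centers, 4))
print(centers, patches)
```

Each resulting patch would then become a graph node carrying its six histograms, with edges between neighboring patches weighted by the distance between their center points.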

We are also considering combining the framework proposed here with topological
techniques, e.g., [36], in particular to address diverse classes such as chairs.

We have started to experiment with more advanced classification methods from the
learning community, applying them to our signatures, e.g., SVMs, which have been
used very successfully in the image recognition literature. Preliminary results are
encouraging, since a straightforward use of SVMs produces results similar to the
metric. Results in all these directions will be reported elsewhere.